Measures of Central Tendency -- Mean, Median, Mode
Location of data plays an important role in the study of statistics. When we take the achievement scores of the students of a class and arrange them in a frequency distribution, we usually find that very few students score either very high or very low. The marks of most students lie somewhere between the highest and lowest scores of the whole class. This tendency of a distribution is called central tendency, and the typical score lying between the extremes and shared by most of the students is referred to as a measure of central tendency.
In this way, a measure of central tendency, as Tate defines it, is “a sort of average or typical value of the items in the series, and its function is to summarize the series in terms of this average value.”
The following are the most common measures of central tendency used in statistics:
1. Arithmetic mean or mean
2. Median
3. Mode
Each of these measures can, in its own way, be called representative of the group: the characteristics of the group as a whole can be described by the single value that the measure gives.
Arithmetic Average or Mean (M)
The mean is the average of a set of scores. For example, if the marks of 5 students in a test of arithmetic are 40, 35, 65, 70 and 55 respectively, the mean score is
M = ΣX / N = (40 + 35 + 65 + 70 + 55) / 5 = 265 / 5 = 53
where
X is the individual score,
N is the number of cases in the group, and
Σ is read as sigma, which means "the sum of".
From the above calculation it is clear that the students securing 40 and 35 are below average, and the students securing 65, 70 and 55 are above average.
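The calculation can be checked in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and do not come from the original text.

```python
from statistics import mean

# Marks of the five students from the example
marks = [40, 35, 65, 70, 55]

# M = ΣX / N
m = sum(marks) / len(marks)

# The standard-library helper gives the same value
assert m == mean(marks)

below = [x for x in marks if x < m]   # scores below the mean
above = [x for x in marks if x > m]   # scores above the mean
print(m, below, above)
```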
Example
Now, according to the formula
M = AM + (Σfd / N) × i
where
AM = assumed mean
f = frequency of a class interval
d = deviation, in class-interval units, of that interval from the interval in which the AM lies
i = size of the class interval
N = total number of cases
M = 47 + (-20 / 40) × 5
= 47 + (-2.5)
= 44.5
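The assumed-mean method can be sketched in Python as follows. The frequency table in this sketch is hypothetical: its figures were invented only so that they reproduce the quantities used in the worked example (AM = 47, Σfd = -20, N = 40, i = 5). The function name `assumed_mean` is likewise an illustrative choice.

```python
# Hypothetical frequency table: midpoint of each class interval -> frequency.
# The figures are invented so that AM = 47, Σfd = -20, N = 40 and i = 5.
table = {32: 3, 37: 6, 42: 11, 47: 10, 52: 8, 57: 2}


def assumed_mean(table, am, i):
    """Return M = AM + (Σfd / N) × i for a grouped frequency table."""
    n = sum(table.values())
    # d is the deviation of each midpoint from AM, in class-interval units
    sum_fd = sum(f * ((mid - am) // i) for mid, f in table.items())
    return am + (sum_fd / n) * i


m = assumed_mean(table, am=47, i=5)

# Cross-check against the direct calculation Σ(f × midpoint) / N
direct = sum(mid * f for mid, f in table.items()) / sum(table.values())
print(m, direct)
```

Both routes give the same mean; the assumed-mean route simply keeps the arithmetic small when the work is done by hand.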
Merits of Arithmetic Mean
It is rigidly defined. Its value is always definite.
It can be accurately determined with the help of various methods.
It is simple and easily understandable.
If the number of items in a series is large, the mean provides a good basis of comparison.
The mean has an important property: the sum of the deviations of all scores from the mean is always zero. The median does not, in general, have this property.
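The zero-sum property of deviations from the mean can be illustrated with the five marks used earlier. This is a small sketch with illustrative variable names, not a general proof.

```python
marks = [40, 35, 65, 70, 55]            # scores from the mean example
m = sum(marks) / len(marks)

# Deviations of every score from the mean always add up to zero
deviations = [x - m for x in marks]
sum_from_mean = sum(deviations)

# The same is not true, in general, of deviations from the median
middle = sorted(marks)[len(marks) // 2]  # middle item of the ordered scores
sum_from_median = sum(x - middle for x in marks)

print(sum_from_mean, sum_from_median)
```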
Limitations
As the skewness of the distribution increases, the reliability of the mean decreases.
A mean gives greater importance to the bigger items of a series and less importance to the small items. One big item among five, four of which are small, will push up the mean considerably. Thus the mean has an upward bias. But the converse is not true: if, in a series of five items, four are big and one is small, the mean will not be pulled down very much.
A mean can be used only with distributions that give absolute scores. It cannot be used with scores expressed as grades.
A mean sometimes gives results that appear almost absurd, e.g. 3.4 children.
Since it is calculated from all the items, an extreme item may considerably affect the mean, particularly when the number of items is not large. For example, the mean of Rs. 275 is not at all a representative figure for Rs. 1000, Rs. 25, Rs. 35 and Rs. 40.
Unless the data are very simple, the mean cannot be located merely by inspection, while the median and the mode can be.
If a single item of a series is missing, the mean cannot be calculated.
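Sometimes the mean leads to a fallacious conclusion.
E.g. Income of two groups
Group A: Rs. 1000, Rs. 100, Rs. 75, Rs. 25
Group B: Rs. 325, Rs. 285, Rs. 285, Rs. 290
For both groups the mean is about Rs. 300 (exactly Rs. 300 for group A and Rs. 296.25 for group B with the figures as listed), so it would appear that the groups are economically at the same level, but this is not really so: the mean of group A is produced by a single income of Rs. 1000.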
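A quick check of the two income groups shows how similar means can hide very different groups. The sketch below uses only the figures listed above; with those figures group B's mean works out to Rs. 296.25 rather than exactly Rs. 300. The median is printed alongside purely for comparison.

```python
from statistics import mean, median

group_a = [1000, 100, 75, 25]
group_b = [325, 285, 285, 290]

mean_a, median_a = mean(group_a), median(group_a)
mean_b, median_b = mean(group_b), median(group_b)

# Similar means, very different medians
print("A:", mean_a, median_a)
print("B:", mean_b, median_b)
```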
The Median (Mdn)
Median is derived from the Latin word “medianus”, which means middle. Thus the median refers to the item which lies at the middle. Hence it is a positional value; the term position means the place where the value lies. The median is the value which lies exactly at the middle of the distribution of scores: 50% of the items (the larger ones) lie above it and 50% (the smaller ones) lie below it. Thus it is the point which splits the distribution into two halves.
Example – Suppose that in a particular class fifteen students secure the following marks: 41, 44, 42, 51, 40, 30, 39, 47, 45, 49, 52, 54, 48, 50, 36. To find the median we first arrange the scores by position, i.e. in order of magnitude, either ascending or descending. Then we work out the middlemost position by applying the formula: position of the median = (N + 1) / 2 = (15 + 1) / 2 = 8, i.e. the 8th item.
The position and value of the median are marked in the illustration below.
Position   Scores in descending order   Position   Scores in ascending order
1          54                           1          30
2          52                           2          36
3          51                           3          39
4          50                           4          40
5          49                           5          41
6          48                           6          42
7          47                           7          44
8          45 (the median)              8          45 (the median)
9          44                           9          47
10         42                           10         48
11         41                           11         49
12         40                           12         50
13         39                           13         51
14         36                           14         52
15         30                           15         54
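The same lookup can be sketched in Python: sort the fifteen marks in either order and read off the item at position (N + 1) / 2. Variable names are illustrative.

```python
scores = [41, 44, 42, 51, 40, 30, 39, 47, 45, 49, 52, 54, 48, 50, 36]

ascending = sorted(scores)
descending = sorted(scores, reverse=True)

n = len(scores)
position = (n + 1) // 2          # (15 + 1) / 2 = 8th item, N being odd

# Positions are counted from 1, list indexes from 0
median_ascending = ascending[position - 1]
median_descending = descending[position - 1]
print(position, median_ascending, median_descending)
```

Whichever order is used, the 8th item is the same score.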
(a) Where N is odd
Example – Find the value of the median from the marks of 5 students: 60, 56, 52, 65, 63.
I. Arrange the scores in order of size: 52, 56, 60, 63, 65. (The scores may be arranged in ascending or descending order; here we arrange them in ascending order.)
II. Add 1 to the number of items, here 5, and divide the sum by 2 to get the position of the median: (5 + 1) / 2 = 3.
III. The median is the 3rd item, i.e. 60.
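(b) Where N is even
Where the number of items is even, we apply the same formula to locate the position of the median. Since we get a fractional value, the median is taken midway between the two items on either side of that position.
Example - Find the value of the median from the following scores: 7, 9, 3, 5, 4, 8.
Arranged in ascending order the scores are 3, 4, 5, 7, 8, 9. The position of the median is (6 + 1) / 2 = 3.5, i.e. midway between the 3rd and 4th items, so the median is (5 + 7) / 2 = 6.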
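Both cases can be handled by one small function. The sketch below follows the (N + 1) / 2 rule described in this section; the function name `median_by_position` is an illustrative choice, and Python's built-in `statistics.median` is used only as a cross-check.

```python
from statistics import median


def median_by_position(scores):
    """Median located at position (N + 1) / 2 of the ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2               # fractional when N is even
    lower = ordered[int(position) - 1]   # item at, or just below, the position
    if n % 2 == 1:
        return lower
    upper = ordered[int(position)]       # item just above the position
    return (lower + upper) / 2


odd_case = median_by_position([60, 56, 52, 65, 63])
even_case = median_by_position([7, 9, 3, 5, 4, 8])
print(odd_case, even_case)

# Cross-check with the standard library
assert odd_case == median([60, 56, 52, 65, 63])
assert even_case == median([7, 9, 3, 5, 4, 8])
```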
Merits of the Median
(1) It is an ideal average, for it is rigidly defined.
(2) It is easily understood without any difficulty.
(3) It is not affected by the values of extreme items (as in a skewed distribution) and as such is sometimes more representative than the mean.
For example, the incomes of five persons may be Rs. 300, Rs. 400, Rs. 450, ..., with a mean value of Rs. 2,300. The median in such cases is a better average.
(4) In the case of an open-ended table, where the values of the extremes are not known, the median can still be calculated if the number of items is known.
(5) The Median never gives absurd or fallacious results.
(6) It can be located in many cases by inspection.
Limitations
I. It cannot be used for computing other statistical measures such as the S.D. or the coefficient of correlation.
II. The arrangement of items in ascending or descending order is sometimes very tedious.
III. When there are wide variations between the values of different items, a median may not be representative of the series.
E.g. Marks: 15, 16, 16, 18, 20, 54, 60, 70, 70, 82
The median is 37 (midway between the 5th and 6th scores, 20 and 54), which is not representative of the series.
IV. If big or small items in a series are to receive greater importance, the median would be an unsuitable average, for it ignores the values of extreme items.
V. A median is more likely to be affected by fluctuations of sampling than the mean.
Mode (Mo)
Mode is defined as the size of the variable which occurs most frequently. It is the point on the score scale that corresponds to the maximum frequency of the distribution. In any series it is the value of the item which is most characteristic or common and is repeated the maximum number of times.
When the distribution is only moderately skewed, the mode can be estimated from the empirical relation
Mode = 3 Median – 2 Mean
For grouped data the mode can also be calculated by the formula
Mode = l + {fx / (fx + fx’)} × s
where
l = lower limit of the class interval having the highest frequency
fx = frequency of the class interval above it
fx’ = frequency of the class interval below it
s = size of the class interval
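The three ways of finding a mode mentioned here can be sketched in Python. All data in this sketch are hypothetical, and the lower limit of 39.5 assumes that exact class limits are used; the function name `grouped_mode` is illustrative.

```python
from statistics import mean, median, multimode

# Ungrouped data (hypothetical scores): the mode is the most frequent score
scores = [4, 5, 5, 6, 6, 6, 7, 8]
observed = multimode(scores)         # list of the most frequent value(s)

# Empirical estimate: Mode = 3 Median - 2 Mean
estimate = 3 * median(scores) - 2 * mean(scores)


def grouped_mode(l, fx, fx_prime, s):
    """Mode = l + {fx / (fx + fx')} × s for grouped data."""
    return l + (fx / (fx + fx_prime)) * s


# Hypothetical modal class 40-44 (exact lower limit 39.5, size 5) whose
# neighbouring classes have frequencies 10 (above) and 6 (below)
grouped = grouped_mode(l=39.5, fx=10, fx_prime=6, s=5)
print(observed, estimate, grouped)
```

The empirical estimate need not coincide with the observed mode; it is only an approximation.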
Merits of Mode
I. It possesses the merit of simplicity. In a discrete series, the mode can be located even by inspection.
II. A mode is not affected by the values of extreme items, provided they adhere to the natural law relating to extremes.
III. For the determination of a mode, it is not necessary to know the values of all the items of the series.
Limitations
I. A mode is ill-defined, indeterminate and indefinite, and so untrustworthy.
II. It is not capable of further mathematical treatment.
III. It is not based on all the observations of a series.
IV. It may be unrepresentative in many cases. A slight change in the series may shift the mode considerably.
V. In ungrouped data, if no score is repeated, it may lead to the wrong conclusion that the series has no mode.
Normal Probability Curve (NPC)
It is observed that in a given test the largest number of students secure average marks, while only a few students secure very high or very low marks. If we represent this distribution graphically, the resulting curve is technically called the normal curve. Literally, normal means average.
In a normal distribution, the scores are distributed around the average in a mathematically specified way.
De Moivre (1733) first developed the equation of the curve. The concept was further developed by Gauss and Laplace.
The normal distribution may be illustrated by a simple mathematical example.
If we toss ten coins many times, the most frequent combination will be 5 heads and 5 tails. Extreme combinations, such as 10 heads and no tails or 10 tails and no heads, will be rare, while intermediate combinations like 6 heads and 4 tails, 7 heads and 3 tails, and vice versa, will be plentiful. If the frequencies with which each combination appears are plotted on a graph, an approximately normal distribution will be obtained.
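The coin-tossing illustration can be made concrete by counting, for each number of heads, how many of the equally likely outcomes produce it. The sketch below assumes fair coins and uses exact counts rather than a simulation; the crude text histogram is only for display.

```python
from math import comb

n = 10  # coins tossed together

# Number of equally likely outcomes that give k heads: C(10, k)
ways = [comb(n, k) for k in range(n + 1)]
total = 2 ** n

for k, w in enumerate(ways):
    # One star for every 6 outcomes, to show the bell shape in plain text
    print(f"{k:2d} heads, {n - k:2d} tails: {w:4d}/{total}  " + "*" * (w // 6))
```

The counts rise symmetrically to a single peak at 5 heads and 5 tails and fall away towards the two extreme combinations.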
The characteristics of the normal curve, or normal probability curve, are as follows.
1) Bell-shaped curve: the shape of the normal curve is like that of a bell.
2) Same numerical value of mean, median and mode: for this curve the mean, median and mode have equal values, so they fall at the same point on the curve.
3) Perfectly symmetrical: the normal curve is perfectly symmetrical. That means the curve slopes down equally on both sides from the centre.
4) The curve is not skewed; therefore the value of the measure of skewness is zero.
5) Asymptotic characteristic of the curve: the normal curve does not touch the base line (the OX axis) on either side. Thus it extends from negative infinity (-∞) to positive infinity (+∞).
6) Distance of the curve: for practical purposes the base line of the curve is divided into six sigma distances, from -3σ to +3σ. Most of the cases, i.e. 99.73%, are covered within this distance.
7) The curve is unimodal: since the mean, median and mode lie at one point of the curve, it has a single peak.
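The 99.73% figure can be checked with Python's standard library. This is a minimal sketch using the standard normal curve (mean 0, standard deviation 1); the helper name `area_within` is illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1


def area_within(k):
    """Proportion of cases lying between -kσ and +kσ."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k}σ: {area_within(k) * 100:.2f}%")

# Mean, median and mode coincide for the normal curve
print(z.mean, z.median, z.mode)
```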
Significance of Normal Probability Curve
The significance of the normal probability curve is as follows:
1) The normal curve is very helpful in educational evaluation and measurement, as it gives the relative position of an individual in a group.
2) The normal curve can be used as a scale of measurement in the behavioral sciences.
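One common way to express an individual's relative position is to convert the score to a standard (z) score and read the proportion of the group below it from the normal curve. The sketch below reuses the fifteen marks from the median example and assumes, purely for illustration, that marks in the group are roughly normally distributed; the function name `relative_position` is hypothetical.

```python
from statistics import NormalDist, mean, pstdev

marks = [41, 44, 42, 51, 40, 30, 39, 47, 45, 49, 52, 54, 48, 50, 36]
m, sd = mean(marks), pstdev(marks)   # group mean and standard deviation


def relative_position(score):
    """Return (z score, proportion of the group expected below the score)."""
    z = (score - m) / sd
    return z, NormalDist().cdf(z)


for score in (36, 45, 54):
    z, below = relative_position(score)
    print(f"score {score}: z = {z:+.2f}, about {below:.0%} of the group below")
```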
Correlation
Simpson & Kafka – “Correlation analysis deals with the association between two or more variables.”
Croxton & Cowden – “When the relationship is of a quantitative nature, the appropriate statistical tool for discovering and measuring the relationship and expressing it in a brief formula is known as correlation.”
Thus correlation is the relationship between two variables in which a change in one variable affects the other variable. The change may be in a positive or a negative direction. When price rises, demand falls: a rise in price causes a decrease in demand. Here the relationship between the two variables is negative, which is known as negative correlation.
When the population increases, expenditure also increases. Here the relationship between the two variables is positive, which is otherwise known as positive correlation.
Where a change in one variable hardly affects the other variable, the relationship between the variables is categorized as zero correlation. The growth of a child does not affect the number of hands or fingers; the number of fingers has zero relationship with the growth of the child. Similarly, an increase in the number of students hardly affects the number or quality of the contents in the courses of study. Here the quality of the course and the number of students have zero correlation.
The coefficient of correlation ranges from -1 to +1:
-1 --- Perfect negative correlation
+1 --- Perfect positive correlation
Example: Calculation of ρ (rho) when no ties exist
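The working table for this example has the columns X, Y, R1 and R2 (the ranks of X and Y), D (the difference between the ranks) and D² (D squared).
Substituting these values in the formula ρ = 1 - 6ΣD² / (N(N² - 1))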
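The rank-difference calculation can be sketched in Python as follows. The marks are hypothetical, since the figures of the original table are not available here, and the function name `spearman_rho` is an illustrative choice. The sketch assumes the usual rank-difference formula ρ = 1 - 6ΣD² / (N(N² - 1)), which applies only when no scores are tied.

```python
def spearman_rho(x, y):
    """Rank-difference correlation for paired scores without ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("this formula assumes no tied scores")

    def ranks(values):
        # Rank 1 goes to the highest score
        order = sorted(values, reverse=True)
        return [order.index(v) + 1 for v in values]

    r1, r2 = ranks(x), ranks(y)                       # columns R1 and R2
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))  # ΣD²
    n = len(x)
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Hypothetical marks of five students in two tests
x = [50, 62, 45, 70, 58]
y = [48, 60, 52, 65, 55]
rho = spearman_rho(x, y)
print(round(rho, 2))
```

Identical rankings give +1 and exactly reversed rankings give -1, matching the two ends of the scale described above.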